Pruning Random Forests for Prediction on a Budget

Nan, Feng, Wang, Joseph, Saligrama, Venkatesh

Neural Information Processing Systems

We propose to prune a random forest (RF) for resource-constrained prediction. We first construct an RF and then prune it to optimize expected feature cost and accuracy. We pose pruning RFs as a novel 0-1 integer program with linear constraints that encourages feature re-use. We establish total unimodularity of the constraint set to prove that the corresponding LP relaxation solves the original integer program. We then exploit connections to combinatorial optimization and develop an efficient primal-dual algorithm, scalable to large datasets. In contrast to our bottom-up approach, which benefits from a good RF initialization, conventional methods are top-down, acquiring features based on their utility value; that problem is generally intractable and requires heuristics. Empirically, our pruning algorithm outperforms existing state-of-the-art resource-constrained algorithms.
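As a rough sketch of the kind of program the abstract describes (the notation below is ours, not taken from the paper): let $z_h \in \{0,1\}$ indicate that node $h$ of some tree becomes a leaf of the pruned forest, let $e_h$ be the training errors incurred by predicting at $h$, let $c_k$ be the acquisition cost of feature $k$, and let $w_{i,k} \in \{0,1\}$ indicate that example $i$ pays for feature $k$:

\[
\begin{aligned}
\min_{z,\, w \,\in\, \{0,1\}} \quad & \sum_{h} e_h z_h \;+\; \lambda \sum_{i} \sum_{k} c_k\, w_{i,k} \\
\text{s.t.} \quad & \sum_{h \in \pi(\ell)} z_h = 1 \quad \text{for every root-to-leaf path } \pi(\ell) \text{ in every tree,} \\
& w_{i,k} \ge z_h \quad \text{whenever example } i \text{ reaches } h \text{ and a split on feature } k \text{ lies above } h \text{ on } i\text{'s path.}
\end{aligned}
\]

Because $w_{i,k}$ is a single variable shared across all trees and nodes, feature $k$ is charged at most once per example, which is what makes re-use attractive; and equality constraints over root-to-leaf paths have the interval structure that typically underlies total-unimodularity arguments, so an integral LP optimum is plausible. The paper's exact program and its primal-dual solver may differ from this reconstruction.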


Reviews: Pruning Random Forests for Prediction on a Budget

Neural Information Processing Systems

The idea of taking feature costs into account when pruning tree ensembles is, to the best of my knowledge, original. The main originality of the proposed approach is that it adopts a bottom-up post-pruning strategy, while most existing approaches are top-down, acting during tree growing. While the authors present this as an advantage of their method, I am actually not convinced that a bottom-up strategy is a good idea for addressing this problem. Since the algorithm cannot modify the existing tree structure (it can only prune it), it should be less effective at reducing feature cost than top-down methods, which directly influence the features selected at tree nodes. For example, let us assume that two very important features in the dataset carry exactly the same information about the output (i.e., they are redundant).
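To make the reviewer's point concrete, here is a toy, self-contained Python sketch of bottom-up cost-aware pruning on a single fitted scikit-learn tree. It is our illustration, not the authors' primal-dual algorithm: the helper prune_for_budget, the unit feature costs, and the greedy rule are all illustrative assumptions, and the toy deliberately ignores the feature re-use credit that the paper's integer program models exactly. Note that it can only collapse existing subtrees into leaves; it can never substitute one redundant feature for another at a node, which is exactly the limitation the review raises.

import numpy as np
from sklearn.datasets import make_classification
from sklearn.tree import DecisionTreeClassifier

def prune_for_budget(clf, feature_costs, lam):
    """Greedy bottom-up pass: collapse an internal node into a leaf whenever
    the training errors pruning introduces are outweighed by lam times the
    expected feature cost it saves. Returns the collapsed node ids (a node
    nested under another collapsed node is subsumed by its ancestor)."""
    t = clf.tree_
    left, right, feat = t.children_left, t.children_right, t.feature
    n_samples = t.weighted_n_node_samples
    # Majority-class fraction per node; max/argmax are invariant to whether
    # tree_.value stores counts or proportions, so this is version-robust.
    v = t.value[:, 0, :]
    maj_frac = v.max(axis=1) / v.sum(axis=1)
    node_err = n_samples * (1.0 - maj_frac)   # errors if predicting at the node

    subtree_err = np.zeros(t.node_count)
    # Toy cost model: every split charges its feature for every sample that
    # reaches it, ignoring re-use of a feature already paid for on the path.
    subtree_cost = np.zeros(t.node_count)
    pruned = set()
    # Children always have larger ids than their parent in sklearn trees,
    # so a reverse pass visits each subtree bottom-up.
    for h in range(t.node_count - 1, -1, -1):
        if left[h] == -1:                     # a real leaf: nothing to decide
            subtree_err[h] = node_err[h]
            continue
        subtree_err[h] = subtree_err[left[h]] + subtree_err[right[h]]
        subtree_cost[h] = (n_samples[h] * feature_costs[feat[h]]
                           + subtree_cost[left[h]] + subtree_cost[right[h]])
        # Prune if the errors the subtree avoids cost more than they are worth.
        if node_err[h] - subtree_err[h] <= lam * subtree_cost[h]:
            pruned.add(h)
            subtree_err[h] = node_err[h]
            subtree_cost[h] = 0.0
    return pruned

X, y = make_classification(n_samples=500, n_features=10, random_state=0)
clf = DecisionTreeClassifier(max_depth=8, random_state=0).fit(X, y)
collapsed = prune_for_budget(clf, feature_costs=np.ones(10), lam=0.01)  # unit costs are a stand-in
print(f"{len(collapsed)} subtrees collapsed out of {clf.tree_.node_count} nodes")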

